How to transfer knowledge from GPT-3 to small private models
A larger teacher model passes what it has learned to a smaller student model. The distilled knowledge can be as good as, and sometimes better than, labels chosen by humans.
What is knowledge distillation?
Knowledge distillation is the process of transferring knowledge from a large model to a single, smaller model that is practical to deploy in real-world applications. In essence, it is a form of model compression.
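To make the teacher-student idea concrete, here is a minimal sketch of one common way to train a student: the soft-target loss, where the student is pushed to match the teacher's temperature-softened output distribution as well as the true label. The article does not say which loss is used, so treat this as an illustration under assumptions: the function names, the example logits, and the `temperature` and `alpha` values are all hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities of temperature-scaled logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]

def distillation_loss(teacher_logits, student_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are illustrative defaults, not values from the article.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Soft term: KL(teacher || student) on the softened distributions.
    soft_loss = sum(math.exp(lt) * (lt - ls)
                    for lt, ls in zip(log_p_teacher, log_p_student))

    # Hard term: ordinary cross-entropy against the true label (temperature 1).
    hard_loss = -log_softmax(student_logits, 1.0)[true_label]

    # The soft term is scaled by temperature**2 so its gradient magnitude
    # stays comparable to the hard term as the temperature changes.
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss

# Hypothetical logits for a 3-class problem.
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 1.0]
print(distillation_loss(teacher_logits, student_logits, true_label=0))
```

A higher temperature flattens the teacher's distribution, which exposes how the teacher ranks the wrong classes, and `alpha` controls how much the student relies on the teacher versus the ground-truth label. When the teacher is only reachable through an API that returns text rather than logits, this exact loss may not apply, and training the student on teacher-generated outputs is a common substitute.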